Overview

Dataset statistics

Number of variables: 21
Number of observations: 21613
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.5 MiB
Average record size in memory: 168.0 B
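
The two memory figures can be cross-checked against each other. The sketch below assumes that the average record size is simply the total size divided by the number of observations and that one MiB is 2**20 bytes; the variable names are illustrative and not taken from the report.

```python
# Cross-check of the memory figures reported in the overview.
# Assumption: average record size = total bytes / number of observations.
N_OBSERVATIONS = 21613
N_VARIABLES = 21
AVG_RECORD_BYTES = 168.0  # as reported

# Total size implied by the average record size, in MiB (2**20 bytes).
total_mib = AVG_RECORD_BYTES * N_OBSERVATIONS / 2**20

# Average bytes stored per value in a record.
bytes_per_value = AVG_RECORD_BYTES / N_VARIABLES

print(f"implied total size: {total_mib:.1f} MiB")
print(f"bytes per value: {bytes_per_value:.1f}")
```

The implied total rounds to the reported 3.5 MiB, and 168 B per record works out to 8 bytes per variable, which is what 64-bit values or object pointers would occupy.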

Variable types

DateTime: 1
Numeric: 14
Categorical: 6

Alerts

price is highly correlated with sqft_living and 3 other fields (High correlation)
bedrooms is highly correlated with bathrooms and 2 other fields (High correlation)
bathrooms is highly correlated with bedrooms and 6 other fields (High correlation)
sqft_living is highly correlated with price and 5 other fields (High correlation)
sqft_lot is highly correlated with sqft_lot15 (High correlation)
floors is highly correlated with bathrooms and 3 other fields (High correlation)
grade is highly correlated with price and 6 other fields (High correlation)
sqft_above is highly correlated with price and 6 other fields (High correlation)
yr_built is highly correlated with bathrooms and 2 other fields (High correlation)
sqft_living15 is highly correlated with price and 4 other fields (High correlation)
sqft_lot15 is highly correlated with sqft_lot (High correlation)
price is highly correlated with bathrooms and 4 other fields (High correlation)
bedrooms is highly correlated with bathrooms and 1 other field (High correlation)
bathrooms is highly correlated with price and 7 other fields (High correlation)
sqft_living is highly correlated with price and 5 other fields (High correlation)
sqft_lot is highly correlated with sqft_lot15 (High correlation)
floors is highly correlated with bathrooms and 1 other field (High correlation)
grade is highly correlated with price and 4 other fields (High correlation)
sqft_above is highly correlated with price and 5 other fields (High correlation)
yr_built is highly correlated with bathrooms (High correlation)
sqft_living15 is highly correlated with price and 4 other fields (High correlation)
sqft_lot15 is highly correlated with sqft_lot (High correlation)
price is highly correlated with grade (High correlation)
bedrooms is highly correlated with sqft_living (High correlation)
bathrooms is highly correlated with sqft_living and 2 other fields (High correlation)
sqft_living is highly correlated with bedrooms and 4 other fields (High correlation)
sqft_lot is highly correlated with sqft_lot15 (High correlation)
grade is highly correlated with price and 4 other fields (High correlation)
sqft_above is highly correlated with bathrooms and 3 other fields (High correlation)
sqft_living15 is highly correlated with sqft_living and 2 other fields (High correlation)
sqft_lot15 is highly correlated with sqft_lot (High correlation)
condition is highly correlated with condition_type (High correlation)
view is highly correlated with waterfront (High correlation)
condition_type is highly correlated with condition (High correlation)
waterfront is highly correlated with view (High correlation)
price is highly correlated with bathrooms and 6 other fields (High correlation)
bedrooms is highly correlated with bathrooms and 2 other fields (High correlation)
bathrooms is highly correlated with price and 8 other fields (High correlation)
sqft_living is highly correlated with price and 7 other fields (High correlation)
sqft_lot is highly correlated with sqft_lot15 (High correlation)
floors is highly correlated with yr_built (High correlation)
condition is highly correlated with yr_built and 1 other field (High correlation)
grade is highly correlated with price and 5 other fields (High correlation)
sqft_above is highly correlated with price and 6 other fields (High correlation)
sqft_basement is highly correlated with price and 3 other fields (High correlation)
yr_built is highly correlated with bathrooms and 3 other fields (High correlation)
lat is highly correlated with price_tier (High correlation)
long is highly correlated with yr_built (High correlation)
sqft_living15 is highly correlated with price and 5 other fields (High correlation)
sqft_lot15 is highly correlated with sqft_lot (High correlation)
dormitory_type is highly correlated with bathrooms (High correlation)
condition_type is highly correlated with condition (High correlation)
price_tier is highly correlated with price and 4 other fields (High correlation)
sqft_basement has 13126 (60.7%) zeros (Zeros)
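
Alerts of this kind can be thought of as simple threshold checks. The sketch below is a minimal illustration, not the report generator's actual logic: both thresholds are hypothetical, the two toy columns are made up for demonstration, and only the zero count (13126 of 21613) comes from the report.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def zero_share(n_zeros, n_rows):
    """Fraction of rows whose value is exactly zero."""
    return n_zeros / n_rows

CORR_THRESHOLD = 0.9   # hypothetical cut-off for a "High correlation" alert
ZERO_THRESHOLD = 0.10  # hypothetical cut-off for a "Zeros" alert

# Made-up toy columns, used only to exercise the correlation check.
toy_a = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_b = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson(toy_a, toy_b)
if abs(r) >= CORR_THRESHOLD:
    print(f"toy_a is highly correlated with toy_b (r = {r:.3f})")

# Zero share for sqft_basement, using the counts reported above.
share = zero_share(13126, 21613)
if share >= ZERO_THRESHOLD:
    print(f"sqft_basement has 13126 ({share:.1%}) zeros")
```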

Reproduction

Analysis started: 2022-04-30 16:37:41.871710
Analysis finished: 2022-04-30 16:38:48.750728
Duration: 1 minute and 6.88 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
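
The reported duration follows directly from the two timestamps. A minimal sketch using only the standard library:

```python
from datetime import datetime

# Timestamps as reported in the Reproduction section.
started = datetime.fromisoformat("2022-04-30 16:37:41.871710")
finished = datetime.fromisoformat("2022-04-30 16:38:48.750728")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute and {seconds:.2f} seconds")
```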

Variables

date
Date

Distinct: 372
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 169.0 KiB
Minimum: 2014-05-02 00:00:00
Maximum: 2015-05-27 00:00:00
Histogram with fixed size bins (bins=50)
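
The minimum, maximum, and distinct count together show how densely the date range is covered. The sketch below assumes each distinct value is one calendar day (every timestamp is at 00:00:00); the variable names are illustrative.

```python
from datetime import date

first_day = date(2014, 5, 2)   # reported minimum
last_day = date(2015, 5, 27)   # reported maximum
distinct_days = 372            # reported distinct count

# Number of calendar days in the range, counting both endpoints.
calendar_days = (last_day - first_day).days + 1
days_without_records = calendar_days - distinct_days

print(f"{distinct_days} of {calendar_days} calendar days have at least one record")
print(f"{days_without_records} days have none")
```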

price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 4028
Distinct (%): 18.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 540088.1418
Minimum: 75000
Maximum: 7700000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 75000
5-th percentile: 210000
Q1: 321950
median: 450000
Q3: 645000
95-th percentile: 1156480
Maximum: 7700000
Range: 7625000
Interquartile range (IQR): 323050

Descriptive statistics

Standard deviation: 367127.1965
Coefficient of variation (CV): 0.6797542255
Kurtosis: 34.58554043
Mean: 540088.1418
Median Absolute Deviation (MAD): 150000
Skewness: 4.024069145
Sum: 1.167292501 × 10^10
Variance: 1.347823784 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
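
Several of the price statistics are derived from others, so they can be recomputed as a consistency check. The sketch below uses only the reported values. The bin width additionally assumes that the 50 fixed-size bins span the full range from minimum to maximum, which the report does not state.

```python
import math

# Values as reported for price.
n = 21613
minimum, maximum = 75000, 7700000
q1, q3 = 321950, 645000
mean, std = 540088.1418, 367127.1965

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n                  # Sum

# Assumption: 50 equal-width bins covering [minimum, maximum].
bin_width = value_range / 50

print(value_range, iqr, round(cv, 4), f"{variance:.4e}", f"{total:.4e}", bin_width)
```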
Value                 Count   Frequency (%)
350000                172     0.8%
450000                172     0.8%
550000                159     0.7%
500000                152     0.7%
425000                150     0.7%
325000                148     0.7%
400000                145     0.7%
375000                138     0.6%
300000                133     0.6%
525000                131     0.6%
Other values (4018)   20113   93.1%

Value                 Count   Frequency (%)
75000                 1       < 0.1%
78000                 1       < 0.1%
80000                 1       < 0.1%
81000                 1       < 0.1%
82000                 1       < 0.1%
82500                 1       < 0.1%
83000                 1       < 0.1%
84000                 1       < 0.1%
85000                 2       < 0.1%
86500                 1       < 0.1%

Value                 Count   Frequency (%)
7700000               1       < 0.1%
7062500               1       < 0.1%
6885000               1       < 0.1%
5570000               1       < 0.1%
5350000               1       < 0.1%
5300000               1       < 0.1%
5110800               1       < 0.1%
4668000               1       < 0.1%
4500000               1       < 0.1%
4489000               1       < 0.1%

bedrooms
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.370841623
Minimum: 0
Maximum: 33
Zeros: 13
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 169.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 3
median: 3
Q3: 4
95-th percentile: 5
Maximum: 33
Range: 33
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9300618311
Coefficient of variation (CV): 0.2759138325
Kurtosis: 49.06365318
Mean: 3.370841623
Median Absolute Deviation (MAD): 1
Skewness: 1.974299535
Sum: 72854
Variance: 0.8650150098
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Value              Count   Frequency (%)
3                  9824    45.5%
4                  6882    31.8%
2                  2760    12.8%
5                  1601    7.4%
6                  272     1.3%
1                  199     0.9%
7                  38      0.2%
0                  13      0.1%
8                  13      0.1%
9                  6       < 0.1%
Other values (3)   5       < 0.1%

Value              Count   Frequency (%)
0                  13      0.1%
1                  199     0.9%
2                  2760    12.8%
3                  9824    45.5%
4                  6882    31.8%
5                  1601    7.4%
6                  272     1.3%
7                  38      0.2%
8                  13      0.1%
9                  6       < 0.1%

Value              Count   Frequency (%)
33                 1       < 0.1%
11                 1       < 0.1%
10                 3       < 0.1%
9                  6       < 0.1%
8                  13      0.1%
7                  38      0.2%
6                  272     1.3%
5                  1601    7.4%
4                  6882    31.8%
3                  9824    45.5%
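
Because bedrooms has only 13 distinct values, its value tables are enough to rebuild the full frequency distribution and recompute the summary statistics. The sketch below takes the counts from the tables in this section; the three values hidden under "Other values (3)" are read from the table of largest values (10, 11, and 33).

```python
# Frequency table for bedrooms, assembled from the value tables in the report.
counts = {
    0: 13, 1: 199, 2: 2760, 3: 9824, 4: 6882, 5: 1601, 6: 272,
    7: 38, 8: 13, 9: 6, 10: 3, 11: 1, 33: 1,
}

n = sum(counts.values())                                  # observations
total = sum(value * count for value, count in counts.items())  # Sum
mean = total / n                                          # Mean

def pct(count):
    """Share of observations as a percentage, rounded to one decimal."""
    return round(100 * count / n, 1)

print(n, total, round(mean, 6), pct(counts[3]), pct(counts[4]))
```

The recomputed observation count, sum, and mean agree with the reported 21613, 72854, and 3.370841623.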

bathrooms
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct30
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.114757322
Minimum0
Maximum8
Zeros10
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size169.0 KiB

Quantile statistics

Minimum0
5-th percentile1
Q11.75
median2.25
Q32.5
95-th percentile3.5
Maximum8
Range8
Interquartile range (IQR)0.75

Descriptive statistics

Standard deviation0.7701631572
Coefficient of variation (CV)0.3641851238
Kurtosis1.279902444
Mean2.114757322
Median Absolute Deviation (MAD)0.5
Skewness0.5111075733
Sum45706.25
Variance0.5931512887
MonotonicityNot monotonic
Histogram with fixed size bins (bins=30)
ValueCountFrequency (%)
2.55380
24.9%
13852
17.8%
1.753048
14.1%
2.252047
 
9.5%
21930
 
8.9%
1.51446
 
6.7%
2.751185
 
5.5%
3753
 
3.5%
3.5731
 
3.4%
3.25589
 
2.7%
Other values (20)652
 
3.0%
ValueCountFrequency (%)
010
 
< 0.1%
0.54
 
< 0.1%
0.7572
 
0.3%
13852
17.8%
1.259
 
< 0.1%
1.51446
 
6.7%
1.753048
14.1%
21930
 
8.9%
2.252047
 
9.5%
2.55380
24.9%
ValueCountFrequency (%)
82
 
< 0.1%
7.751
 
< 0.1%
7.51
 
< 0.1%
6.752
 
< 0.1%
6.52
 
< 0.1%
6.252
 
< 0.1%
66
< 0.1%
5.754
 
< 0.1%
5.510
< 0.1%
5.2513
0.1%

sqft_living
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct1038
Distinct (%)4.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2079.899736
Minimum290
Maximum13540
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size169.0 KiB

Quantile statistics

Minimum290
5-th percentile940
Q11427
median1910
Q32550
95-th percentile3760
Maximum13540
Range13250
Interquartile range (IQR)1123

Descriptive statistics

Standard deviation918.440897
Coefficient of variation (CV)0.4415794093
Kurtosis5.24309299
Mean2079.899736
Median Absolute Deviation (MAD)540
Skewness1.471555427
Sum44952873
Variance843533.6814
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1300138
 
0.6%
1400135
 
0.6%
1440133
 
0.6%
1800129
 
0.6%
1660129
 
0.6%
1010129
 
0.6%
1820128
 
0.6%
1480125
 
0.6%
1720125
 
0.6%
1540124
 
0.6%
Other values (1028)20318
94.0%
ValueCountFrequency (%)
2901
< 0.1%
3701
< 0.1%
3801
< 0.1%
3841
< 0.1%
3902
< 0.1%
4101
< 0.1%
4202
< 0.1%
4301
< 0.1%
4401
< 0.1%
4601
< 0.1%
ValueCountFrequency (%)
135401
< 0.1%
120501
< 0.1%
100401
< 0.1%
98901
< 0.1%
96401
< 0.1%
92001
< 0.1%
86701
< 0.1%
80201
< 0.1%
80101
< 0.1%
80001
< 0.1%

sqft_lot
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct9782
Distinct (%)45.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean15106.96757
Minimum520
Maximum1651359
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size169.0 KiB

Quantile statistics

Minimum520
5-th percentile1800
Q15040
median7618
Q310688
95-th percentile43339.2
Maximum1651359
Range1650839
Interquartile range (IQR)5648

Descriptive statistics

Standard deviation41420.51152
Coefficient of variation (CV)2.741815082
Kurtosis285.0778197
Mean15106.96757
Median Absolute Deviation (MAD)2618
Skewness13.06001896
Sum326506890
Variance1715658774
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
5000358
 
1.7%
6000290
 
1.3%
4000251
 
1.2%
7200220
 
1.0%
4800120
 
0.6%
7500119
 
0.6%
4500114
 
0.5%
8400111
 
0.5%
9600109
 
0.5%
3600103
 
0.5%
Other values (9772)19818
91.7%
ValueCountFrequency (%)
5201
< 0.1%
5721
< 0.1%
6001
< 0.1%
6091
< 0.1%
6351
< 0.1%
6381
< 0.1%
6492
< 0.1%
6511
< 0.1%
6751
< 0.1%
6761
< 0.1%
ValueCountFrequency (%)
16513591
< 0.1%
11647941
< 0.1%
10742181
< 0.1%
10240681
< 0.1%
9829981
< 0.1%
9822781
< 0.1%
9204231
< 0.1%
8816541
< 0.1%
8712002
< 0.1%
8433091
< 0.1%

floors
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.494308981
Minimum1
Maximum3.5
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size169.0 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1.5
Q32
95-th percentile2
Maximum3.5
Range2.5
Interquartile range (IQR)1

Descriptive statistics

Standard deviation0.5399888951
Coefficient of variation (CV)0.361363615
Kurtosis-0.4847229368
Mean1.494308981
Median Absolute Deviation (MAD)0.5
Skewness0.6161767212
Sum32296.5
Variance0.2915880069
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
ValueCountFrequency (%)
110680
49.4%
28241
38.1%
1.51910
 
8.8%
3613
 
2.8%
2.5161
 
0.7%
3.58
 
< 0.1%
ValueCountFrequency (%)
110680
49.4%
1.51910
 
8.8%
28241
38.1%
2.5161
 
0.7%
3613
 
2.8%
3.58
 
< 0.1%
ValueCountFrequency (%)
3.58
 
< 0.1%
3613
 
2.8%
2.5161
 
0.7%
28241
38.1%
1.51910
 
8.8%
110680
49.4%

waterfront
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size169.0 KiB
0
21450 
1
 
163

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
021450
99.2%
1163
 
0.8%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
021450
99.2%
1163
 
0.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

view
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size169.0 KiB
0
19489 
2
 
963
3
 
510
1
 
332
4
 
319

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
019489
90.2%
2963
 
4.5%
3510
 
2.4%
1332
 
1.5%
4319
 
1.5%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
019489
90.2%
2963
 
4.5%
3510
 
2.4%
1332
 
1.5%
4319
 
1.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

condition
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size169.0 KiB
3
14031 
4
5679 
5
1701 
2
 
172
1
 
30

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row3
2nd row3
3rd row3
4th row5
5th row3

Common Values

ValueCountFrequency (%)
314031
64.9%
45679
26.3%
51701
 
7.9%
2172
 
0.8%
130
 
0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
314031
64.9%
45679
26.3%
51701
 
7.9%
2172
 
0.8%
130
 
0.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

grade
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct12
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean7.656873178
Minimum1
Maximum13
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size169.0 KiB

Quantile statistics

Minimum1
5-th percentile6
Q17
median7
Q38
95-th percentile10
Maximum13
Range12
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.175458757
Coefficient of variation (CV)0.1535168116
Kurtosis1.190932077
Mean7.656873178
Median Absolute Deviation (MAD)1
Skewness0.7711032008
Sum165488
Variance1.381703289
MonotonicityNot monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
78981
41.6%
86068
28.1%
92615
 
12.1%
62038
 
9.4%
101134
 
5.2%
11399
 
1.8%
5242
 
1.1%
1290
 
0.4%
429
 
0.1%
1313
 
0.1%
Other values (2)4
 
< 0.1%
ValueCountFrequency (%)
11
 
< 0.1%
33
 
< 0.1%
429
 
0.1%
5242
 
1.1%
62038
 
9.4%
78981
41.6%
86068
28.1%
92615
 
12.1%
101134
 
5.2%
11399
 
1.8%
ValueCountFrequency (%)
1313
 
0.1%
1290
 
0.4%
11399
 
1.8%
101134
 
5.2%
92615
 
12.1%
86068
28.1%
78981
41.6%
62038
 
9.4%
5242
 
1.1%
429
 
0.1%

sqft_above
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct946
Distinct (%)4.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1788.390691
Minimum290
Maximum9410
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size169.0 KiB

Quantile statistics

Minimum290
5-th percentile850
Q11190
median1560
Q32210
95-th percentile3400
Maximum9410
Range9120
Interquartile range (IQR)1020

Descriptive statistics

Standard deviation828.0909777
Coefficient of variation (CV)0.4630369538
Kurtosis3.402303621
Mean1788.390691
Median Absolute Deviation (MAD)450
Skewness1.446664473
Sum38652488
Variance685734.6673
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1300212
 
1.0%
1010210
 
1.0%
1200206
 
1.0%
1220192
 
0.9%
1140184
 
0.9%
1400180
 
0.8%
1060178
 
0.8%
1180177
 
0.8%
1340176
 
0.8%
1250174
 
0.8%
Other values (936)19724
91.3%
ValueCountFrequency (%)
2901
< 0.1%
3701
< 0.1%
3801
< 0.1%
3841
< 0.1%
3902
< 0.1%
4101
< 0.1%
4202
< 0.1%
4301
< 0.1%
4401
< 0.1%
4601
< 0.1%
ValueCountFrequency (%)
94101
< 0.1%
88601
< 0.1%
85701
< 0.1%
80201
< 0.1%
78801
< 0.1%
78501
< 0.1%
76801
< 0.1%
74201
< 0.1%
73201
< 0.1%
67201
< 0.1%

sqft_basement
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct306
Distinct (%)1.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean291.5090455
Minimum0
Maximum4820
Zeros13126
Zeros (%)60.7%
Negative0
Negative (%)0.0%
Memory size169.0 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q3560
95-th percentile1190
Maximum4820
Range4820
Interquartile range (IQR)560

Descriptive statistics

Standard deviation442.5750427
Coefficient of variation (CV)1.518220616
Kurtosis2.715574211
Mean291.5090455
Median Absolute Deviation (MAD)0
Skewness1.577965056
Sum6300385
Variance195872.6684
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
013126
60.7%
600221
 
1.0%
700218
 
1.0%
500214
 
1.0%
800206
 
1.0%
400184
 
0.9%
1000149
 
0.7%
900144
 
0.7%
300142
 
0.7%
200108
 
0.5%
Other values (296)6901
31.9%
ValueCountFrequency (%)
013126
60.7%
102
 
< 0.1%
201
 
< 0.1%
404
 
< 0.1%
5011
 
0.1%
6010
 
< 0.1%
651
 
< 0.1%
707
 
< 0.1%
8020
 
0.1%
9021
 
0.1%
ValueCountFrequency (%)
48201
< 0.1%
41301
< 0.1%
35001
< 0.1%
34801
< 0.1%
32601
< 0.1%
30001
< 0.1%
28501
< 0.1%
28101
< 0.1%
27301
< 0.1%
27201
< 0.1%
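
The reported sums for sqft_living, sqft_above, and sqft_basement are consistent with living area being the sum of above-ground and basement area, which would also help explain why these columns are flagged as highly correlated. The sketch below only checks the aggregate figures; whether the identity holds row by row is an assumption that the report alone cannot confirm.

```python
import math

# Sums and means as reported for the three area columns.
sum_living, sum_above, sum_basement = 44952873, 38652488, 6300385
mean_living, mean_above, mean_basement = 2079.899736, 1788.390691, 291.5090455

# Difference between total living area and the two components combined.
gap = sum_living - (sum_above + sum_basement)
means_agree = math.isclose(mean_above + mean_basement, mean_living, abs_tol=1e-5)

print(f"gap in sums: {gap}")
print(f"means agree: {means_agree}")
```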

yr_built
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct116
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1971.005136
Minimum1900
Maximum2015
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size169.0 KiB

Quantile statistics

Minimum1900
5-th percentile1915
Q11951
median1975
Q31997
95-th percentile2011
Maximum2015
Range115
Interquartile range (IQR)46

Descriptive statistics

Standard deviation29.3734108
Coefficient of variation (CV)0.01490275711
Kurtosis-0.6574075047
Mean1971.005136
Median Absolute Deviation (MAD)23
Skewness-0.4698053988
Sum42599334
Variance862.7972622
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2014559
 
2.6%
2006454
 
2.1%
2005450
 
2.1%
2004433
 
2.0%
2003422
 
2.0%
2007417
 
1.9%
1977417
 
1.9%
1978387
 
1.8%
1968381
 
1.8%
2008367
 
1.7%
Other values (106)17326
80.2%
ValueCountFrequency (%)
190087
0.4%
190129
 
0.1%
190227
 
0.1%
190346
0.2%
190445
0.2%
190574
0.3%
190692
0.4%
190765
0.3%
190886
0.4%
190994
0.4%
ValueCountFrequency (%)
201538
 
0.2%
2014559
2.6%
2013201
 
0.9%
2012170
 
0.8%
2011130
 
0.6%
2010143
 
0.7%
2009230
1.1%
2008367
1.7%
2007417
1.9%
2006454
2.1%

lat
Real number (ℝ≥0)

HIGH CORRELATION

Distinct5034
Distinct (%)23.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean47.56005252
Minimum47.1559
Maximum47.7776
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size169.0 KiB

Quantile statistics

Minimum47.1559
5-th percentile47.3103
Q147.471
median47.5718
Q347.678
95-th percentile47.74964
Maximum47.7776
Range0.6217
Interquartile range (IQR)0.207

Descriptive statistics

Standard deviation0.1385637102
Coefficient of variation (CV)0.002913447377
Kurtosis-0.6763130016
Mean47.56005252
Median Absolute Deviation (MAD)0.1049
Skewness-0.4852704765
Sum1027915.415
Variance0.0191999018
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
47.662417
 
0.1%
47.532217
 
0.1%
47.684617
 
0.1%
47.549117
 
0.1%
47.695516
 
0.1%
47.688616
 
0.1%
47.671116
 
0.1%
47.540215
 
0.1%
47.684215
 
0.1%
47.690415
 
0.1%
Other values (5024)21452
99.3%
ValueCountFrequency (%)
47.15591
< 0.1%
47.15931
< 0.1%
47.16221
< 0.1%
47.16471
< 0.1%
47.17641
< 0.1%
47.17751
< 0.1%
47.17762
< 0.1%
47.17951
< 0.1%
47.18031
< 0.1%
47.18081
< 0.1%
ValueCountFrequency (%)
47.77763
< 0.1%
47.77753
< 0.1%
47.77741
 
< 0.1%
47.77723
< 0.1%
47.77712
 
< 0.1%
47.7772
 
< 0.1%
47.77693
< 0.1%
47.77682
 
< 0.1%
47.77676
< 0.1%
47.77664
< 0.1%

long
Real number (ℝ)

HIGH CORRELATION

Distinct752
Distinct (%)3.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-122.2138964
Minimum-122.519
Maximum-121.315
Zeros0
Zeros (%)0.0%
Negative21613
Negative (%)100.0%
Memory size169.0 KiB

Quantile statistics

Minimum-122.519
5-th percentile-122.387
Q1-122.328
median-122.23
Q3-122.125
95-th percentile-121.979
Maximum-121.315
Range1.204
Interquartile range (IQR)0.203

Descriptive statistics

Standard deviation0.1408283424
Coefficient of variation (CV)-0.001152310388
Kurtosis1.049500887
Mean-122.2138964
Median Absolute Deviation (MAD)0.101
Skewness0.8850529834
Sum-2641408.943
Variance0.01983262202
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
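
The negative coefficient of variation reported for long is a consequence of the negative mean, not a sign of a data problem. A minimal sketch, assuming CV is computed as standard deviation divided by mean:

```python
# Values as reported for long.
mean = -122.2138964
std = 0.1408283424

# Assumption: CV = standard deviation / mean, so the sign follows the mean.
cv = std / mean

print(f"CV = {cv:.9f}")
```

For coordinates like these, the absolute value of CV or the standard deviation itself is usually the more meaningful measure of spread.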
ValueCountFrequency (%)
-122.29116
 
0.5%
-122.3111
 
0.5%
-122.362104
 
0.5%
-122.291100
 
0.5%
-122.36399
 
0.5%
-122.37299
 
0.5%
-122.28898
 
0.5%
-122.35796
 
0.4%
-122.28495
 
0.4%
-122.36594
 
0.4%
Other values (742)20601
95.3%
ValueCountFrequency (%)
-122.5191
 
< 0.1%
-122.5151
 
< 0.1%
-122.5141
 
< 0.1%
-122.5121
 
< 0.1%
-122.5112
< 0.1%
-122.5092
< 0.1%
-122.5071
 
< 0.1%
-122.5061
 
< 0.1%
-122.5053
< 0.1%
-122.5042
< 0.1%
ValueCountFrequency (%)
-121.3152
< 0.1%
-121.3161
< 0.1%
-121.3191
< 0.1%
-121.3211
< 0.1%
-121.3251
< 0.1%
-121.3522
< 0.1%
-121.3591
< 0.1%
-121.3642
< 0.1%
-121.4021
< 0.1%
-121.4031
< 0.1%

sqft_living15
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct777
Distinct (%)3.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1986.552492
Minimum399
Maximum6210
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size169.0 KiB

Quantile statistics

Minimum399
5-th percentile1140
Q11490
median1840
Q32360
95-th percentile3300
Maximum6210
Range5811
Interquartile range (IQR)870

Descriptive statistics

Standard deviation685.3913043
Coefficient of variation (CV)0.3450154512
Kurtosis1.59709581
Mean1986.552492
Median Absolute Deviation (MAD)410
Skewness1.108181276
Sum42935359
Variance469761.2399
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1540197
 
0.9%
1440195
 
0.9%
1560192
 
0.9%
1500181
 
0.8%
1460169
 
0.8%
1580167
 
0.8%
1610166
 
0.8%
1720166
 
0.8%
1800166
 
0.8%
1620165
 
0.8%
Other values (767)19849
91.8%
ValueCountFrequency (%)
3991
 
< 0.1%
4602
 
< 0.1%
6202
 
< 0.1%
6701
 
< 0.1%
6902
 
< 0.1%
7002
 
< 0.1%
7102
 
< 0.1%
7202
 
< 0.1%
7408
< 0.1%
7503
 
< 0.1%
ValueCountFrequency (%)
62101
 
< 0.1%
61101
 
< 0.1%
57906
< 0.1%
56101
 
< 0.1%
56001
 
< 0.1%
55001
 
< 0.1%
53801
 
< 0.1%
53401
 
< 0.1%
53301
 
< 0.1%
52201
 
< 0.1%

sqft_lot15
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct8689
Distinct (%)40.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean12768.45565
Minimum651
Maximum871200
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size169.0 KiB

Quantile statistics

Minimum651
5-th percentile1999.2
Q15100
median7620
Q310083
95-th percentile37062.8
Maximum871200
Range870549
Interquartile range (IQR)4983

Descriptive statistics

Standard deviation27304.17963
Coefficient of variation (CV)2.138408933
Kurtosis150.76311
Mean12768.45565
Median Absolute Deviation (MAD)2505
Skewness9.506743247
Sum275964632
Variance745518225.3
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
5000427
 
2.0%
4000357
 
1.7%
6000289
 
1.3%
7200211
 
1.0%
4800145
 
0.7%
7500142
 
0.7%
8400116
 
0.5%
3600111
 
0.5%
4500111
 
0.5%
5100109
 
0.5%
Other values (8679)19595
90.7%
ValueCountFrequency (%)
6511
 
< 0.1%
6591
 
< 0.1%
6601
 
< 0.1%
7482
< 0.1%
7504
< 0.1%
7551
 
< 0.1%
7571
 
< 0.1%
7581
 
< 0.1%
7881
 
< 0.1%
7941
 
< 0.1%
ValueCountFrequency (%)
8712001
< 0.1%
8581321
< 0.1%
5606171
< 0.1%
4382131
< 0.1%
4347281
< 0.1%
4255811
< 0.1%
4229671
< 0.1%
4119621
< 0.1%
3920402
< 0.1%
3868121
< 0.1%

dormitory_type
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size169.0 KiB
house
18641 
apartment
2760 
studio
 
212

Length

Max length9
Median length5
Mean length5.520612594
Min length5

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowhouse
2nd rowhouse
3rd rowapartment
4th rowhouse
5th rowhouse

Common Values

ValueCountFrequency (%)
house18641
86.2%
apartment2760
 
12.8%
studio212
 
1.0%
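
The length statistics for dormitory_type follow from the category labels and their counts. The sketch below recomputes them from the Common Values table, assuming length means the number of characters in each label.

```python
import statistics

# Counts from the Common Values table.
counts = {"house": 18641, "apartment": 2760, "studio": 212}

n = sum(counts.values())
# Count-weighted mean of the label lengths.
mean_length = sum(len(label) * count for label, count in counts.items()) / n

# Expand to one length per observation to take the median.
lengths = [len(label) for label, count in counts.items() for _ in range(count)]
median_length = statistics.median(lengths)

print(min(lengths), median_length, round(mean_length, 6), max(lengths))
```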

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
house18641
86.2%
apartment2760
 
12.8%
studio212
 
1.0%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

condition_type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size169.0 KiB
regular
19710 
good
 
1701
bad
 
202

Length

Max length7
Median length7
Mean length6.726507195
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowregular
2nd rowregular
3rd rowregular
4th rowgood
5th rowregular

Common Values

ValueCountFrequency (%)
regular19710
91.2%
good1701
 
7.9%
bad202
 
0.9%
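
The counts reported for condition and condition_type are consistent with condition_type being a coarser grouping of condition, which would explain the high-correlation alert between the two. The grouping below (1 and 2 as bad, 3 and 4 as regular, 5 as good) is inferred from the counts alone; it is an assumption, not something the report states, and the function name is hypothetical.

```python
from collections import Counter

# Counts reported for condition and for condition_type.
condition_counts = {1: 30, 2: 172, 3: 14031, 4: 5679, 5: 1701}
reported_type_counts = {"regular": 19710, "good": 1701, "bad": 202}

def condition_type(condition):
    """Hypothetical grouping inferred from the reported counts."""
    if condition <= 2:
        return "bad"
    if condition <= 4:
        return "regular"
    return "good"

# Aggregate the condition counts under the hypothetical grouping.
derived_type_counts = Counter()
for condition, count in condition_counts.items():
    derived_type_counts[condition_type(condition)] += count

print(dict(derived_type_counts) == reported_type_counts)
```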

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
regular19710
91.2%
good1701
 
7.9%
bad202
 
0.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

price_tier
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size169.0 KiB
tier 2
5460 
tier 1
5404 
tier 3
5376 
tier 4
5373 

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowtier 1
2nd rowtier 3
3rd rowtier 1
4th rowtier 3
5th rowtier 3

Common Values

ValueCountFrequency (%)
tier 25460
25.3%
tier 15404
25.0%
tier 35376
24.9%
tier 45373
24.9%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
tier21613
50.0%
25460
 
12.6%
15404
 
12.5%
35376
 
12.4%
45373
 
12.4%
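
The table above lists "tier" with a count of 21613 and a share of 50.0%, which matches what one would get by counting whitespace-separated words instead of whole category labels. The sketch below reproduces the figures under that assumption; how the report generator actually tokenises labels is not confirmed by the report itself.

```python
from collections import Counter

# Label counts from the Common Values table for price_tier.
tier_counts = {"tier 2": 5460, "tier 1": 5404, "tier 3": 5376, "tier 4": 5373}

# Assumption: each label is split on whitespace and every word is counted.
word_counts = Counter()
for label, count in tier_counts.items():
    for word in label.split():
        word_counts[word] += count

total_words = sum(word_counts.values())
shares = {word: round(100 * count / total_words, 1)
          for word, count in word_counts.items()}

print(word_counts["tier"], shares)
```

Read this way, the table is a word-frequency view of the labels, and the Common Values table remains the one to use for the actual tier distribution.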

Most occurring characters

Value | Count | Frequency (%)
No values found.

Most occurring categories

Value | Count | Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value | Count | Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value | Count | Frequency (%)
No values found.

Most frequent character per block

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
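The calculation described above can be sketched in plain Python as follows. This is a minimal illustration, not the report's own implementation: the helper names `average_ranks` and `spearman_rho` are hypothetical, ties are assumed to receive the average of their rank positions, and the five input values are simply the `sqft_living` and `price` entries of the first five sample rows.

```python
from statistics import mean


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    # The 1/n factors of the covariance and standard deviations cancel, so sums suffice.
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


sqft_living = [1180, 2570, 770, 1960, 1680]
price = [221900.0, 538000.0, 180000.0, 604000.0, 510000.0]
rho = spearman_rho(sqft_living, price)
print(round(rho, 3))
```

Because only ranks enter the computation, any strictly increasing transformation of either variable leaves ρ unchanged.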

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and positive scale of the two variables, so for an exactly linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
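A minimal sketch of this formula in plain Python is shown below, together with a check of the location and scale invariance mentioned above. The function name `pearson_r` and the rescaling constants are illustrative assumptions; the inputs are the `sqft_living` and `price` values of the first five sample rows, so the result is not the coefficient shown in the report.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors cancel between numerator and denominator.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


sqft_living = [1180, 2570, 770, 1960, 1680]
price = [221900.0, 538000.0, 180000.0, 604000.0, 510000.0]

r = pearson_r(sqft_living, price)
# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([0.0929 * v + 10 for v in sqft_living], price)
print(round(r, 3), round(r_rescaled, 3))
```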

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
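The pair-counting definition can be sketched directly in Python. This is the simple variant often called τ-a, which applies no tie correction; statistical libraries frequently default to a tie-adjusted variant instead, so results may differ when the data contain ties. The function name `kendall_tau_a` is hypothetical, and the inputs are again the first five sample rows.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # sign == 0 means a tie in x or y; the pair counts for neither side
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


sqft_living = [1180, 2570, 770, 1960, 1680]
price = [221900.0, 538000.0, 180000.0, 604000.0, 510000.0]
tau = kendall_tau_a(sqft_living, price)
print(tau)
```

Of the ten pairs in this small example, nine are concordant and one is discordant (the 2570 sq ft home sold for less than the 1960 sq ft home).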

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
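As an illustration, a bias-corrected Cramér's V along the lines of Bergsma's proposal can be sketched in plain Python as below. The function name `cramers_v_corrected` is hypothetical, the chi-squared statistic is computed without any continuity correction, and the report's own implementation may differ in such details.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row_label, row_n in row_totals.items():
        for col_label, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Hypothetical labels: perfectly associated versus independent columns.
perfect = cramers_v_corrected(["x", "y"] * 50, ["p", "q"] * 50)
independent = cramers_v_corrected(["x", "x", "y", "y"] * 25, ["p", "q", "p", "q"] * 25)
print(perfect, independent)
```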

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
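The per-column nullity count behind such a display can be sketched in a few lines of Python. The helper name `nullity_by_column` and the three records below are hypothetical, with `None` standing in for a missing cell; the profiled dataset itself reports no missing cells.

```python
def nullity_by_column(rows):
    """Count missing (None) cells per column for a list of dict records."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


# Hypothetical records for illustration only.
records = [
    {"price": 221900.0, "bedrooms": 3, "yr_built": 1955},
    {"price": None, "bedrooms": 3, "yr_built": 1951},
    {"price": 180000.0, "bedrooms": None, "yr_built": None},
]
missing = nullity_by_column(records)
print(missing)
```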

Sample

First rows

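index | date | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | grade | sqft_above | sqft_basement | yr_built | lat | long | sqft_living15 | sqft_lot15 | dormitory_type | condition_type | price_tier
0 | 2014-10-13 | 221900.00 | 3 | 1.00 | 1180 | 5650 | 1.00 | 0 | 0 | 3 | 7 | 1180 | 0 | 1955 | 47.51 | -122.26 | 1340 | 5650 | house | regular | tier 1
1 | 2014-12-09 | 538000.00 | 3 | 2.25 | 2570 | 7242 | 2.00 | 0 | 0 | 3 | 7 | 2170 | 400 | 1951 | 47.72 | -122.32 | 1690 | 7639 | house | regular | tier 3
2 | 2015-02-25 | 180000.00 | 2 | 1.00 | 770 | 10000 | 1.00 | 0 | 0 | 3 | 6 | 770 | 0 | 1933 | 47.74 | -122.23 | 2720 | 8062 | apartment | regular | tier 1
3 | 2014-12-09 | 604000.00 | 4 | 3.00 | 1960 | 5000 | 1.00 | 0 | 0 | 5 | 7 | 1050 | 910 | 1965 | 47.52 | -122.39 | 1360 | 5000 | house | good | tier 3
4 | 2015-02-18 | 510000.00 | 3 | 2.00 | 1680 | 8080 | 1.00 | 0 | 0 | 3 | 8 | 1680 | 0 | 1987 | 47.62 | -122.05 | 1800 | 7503 | house | regular | tier 3
5 | 2014-05-12 | 1225000.00 | 4 | 4.50 | 5420 | 101930 | 1.00 | 0 | 0 | 3 | 11 | 3890 | 1530 | 2001 | 47.66 | -122.00 | 4760 | 101930 | house | regular | tier 4
6 | 2014-06-27 | 257500.00 | 3 | 2.25 | 1715 | 6819 | 2.00 | 0 | 0 | 3 | 7 | 1715 | 0 | 1995 | 47.31 | -122.33 | 2238 | 6819 | house | regular | tier 1
7 | 2015-01-15 | 291850.00 | 3 | 1.50 | 1060 | 9711 | 1.00 | 0 | 0 | 3 | 7 | 1060 | 0 | 1963 | 47.41 | -122.31 | 1650 | 9711 | house | regular | tier 1
8 | 2015-04-15 | 229500.00 | 3 | 1.00 | 1780 | 7470 | 1.00 | 0 | 0 | 3 | 7 | 1050 | 730 | 1960 | 47.51 | -122.34 | 1780 | 8113 | house | regular | tier 1
9 | 2015-03-12 | 323000.00 | 3 | 2.50 | 1890 | 6560 | 2.00 | 0 | 0 | 3 | 7 | 1890 | 0 | 2003 | 47.37 | -122.03 | 2390 | 7570 | house | regular | tier 2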

Last rows

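index | date | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | grade | sqft_above | sqft_basement | yr_built | lat | long | sqft_living15 | sqft_lot15 | dormitory_type | condition_type | price_tier
21603 | 2014-08-25 | 507250.00 | 3 | 2.50 | 2270 | 5536 | 2.00 | 0 | 0 | 3 | 8 | 2270 | 0 | 2003 | 47.54 | -121.88 | 2270 | 5731 | house | regular | tier 3
21604 | 2015-01-26 | 429000.00 | 3 | 2.00 | 1490 | 1126 | 3.00 | 0 | 0 | 3 | 8 | 1490 | 0 | 2014 | 47.57 | -122.29 | 1400 | 1230 | house | regular | tier 2
21605 | 2014-10-14 | 610685.00 | 4 | 2.50 | 2520 | 6023 | 2.00 | 0 | 0 | 3 | 9 | 2520 | 0 | 2014 | 47.51 | -122.17 | 2520 | 6023 | house | regular | tier 3
21606 | 2015-03-26 | 1007500.00 | 4 | 3.50 | 3510 | 7200 | 2.00 | 0 | 0 | 3 | 9 | 2600 | 910 | 2009 | 47.55 | -122.40 | 2050 | 6200 | house | regular | tier 4
21607 | 2015-02-19 | 475000.00 | 3 | 2.50 | 1310 | 1294 | 2.00 | 0 | 0 | 3 | 8 | 1180 | 130 | 2008 | 47.58 | -122.41 | 1330 | 1265 | house | regular | tier 3
21608 | 2014-05-21 | 360000.00 | 3 | 2.50 | 1530 | 1131 | 3.00 | 0 | 0 | 3 | 8 | 1530 | 0 | 2009 | 47.70 | -122.35 | 1530 | 1509 | house | regular | tier 2
21609 | 2015-02-23 | 400000.00 | 4 | 2.50 | 2310 | 5813 | 2.00 | 0 | 0 | 3 | 8 | 2310 | 0 | 2014 | 47.51 | -122.36 | 1830 | 7200 | house | regular | tier 2
21610 | 2014-06-23 | 402101.00 | 2 | 0.75 | 1020 | 1350 | 2.00 | 0 | 0 | 3 | 7 | 1020 | 0 | 2009 | 47.59 | -122.30 | 1020 | 2007 | apartment | regular | tier 2
21611 | 2015-01-16 | 400000.00 | 3 | 2.50 | 1600 | 2388 | 2.00 | 0 | 0 | 3 | 8 | 1600 | 0 | 2004 | 47.53 | -122.07 | 1410 | 1287 | house | regular | tier 2
21612 | 2014-10-15 | 325000.00 | 2 | 0.75 | 1020 | 1076 | 2.00 | 0 | 0 | 3 | 7 | 1020 | 0 | 2008 | 47.59 | -122.30 | 1020 | 1357 | apartment | regular | tier 2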